Integrated Annotation For Biomedical Information Extraction

Authors

  • Seth Kulick
  • Ann Bies
  • Mark Y. Liberman
  • Mark Mandel
  • Ryan McDonald
  • Martha Palmer
  • Andrew Schein
  • Lyle H. Ungar
  • Scott Winters
  • Peter White
Abstract

We describe an approach to two areas of biomedical information extraction: drug development and cancer genomics. We have developed a framework that includes corpus annotation integrated at multiple levels: a Treebank containing syntactic structure, a Propbank containing predicate-argument structure, and annotation of entities and relations among the entities. Crucial to this approach is the proper characterization of entities as relation components, which allows the integration of the entity annotation with the syntactic structure while retaining the capacity to annotate and extract more complex events. We are using this annotation to train statistical taggers for such extraction, and we are also using the taggers to improve the annotation process.
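
To make the idea of annotation "integrated at multiple levels" concrete, here is a minimal Python sketch of one way the layers could share token offsets so that entity spans can be checked against syntactic constituents. It is an illustration under stated assumptions, not the authors' annotation format: the class names (Span, Sentence), the method unaligned_entities, the toy sentence, and the entity labels are all hypothetical. Only the ARG0/ARG1 role names follow the general Propbank convention.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Span:
    """A labeled token range; hypothetical helper, not from the paper."""
    start: int  # index of the first token (inclusive)
    end: int    # index one past the last token (exclusive)
    label: str


@dataclass
class Sentence:
    """One sentence carrying three annotation layers over the same tokens."""
    tokens: List[str]
    constituents: List[Span] = field(default_factory=list)  # Treebank-style layer
    entities: List[Span] = field(default_factory=list)      # entity layer
    # Propbank-style layer: (predicate token index, role, argument span)
    arguments: List[Tuple[int, str, Span]] = field(default_factory=list)

    def text(self, span: Span) -> str:
        """Return the surface string covered by a span."""
        return " ".join(self.tokens[span.start:span.end])

    def unaligned_entities(self) -> List[Span]:
        """Return entities whose token range matches no constituent."""
        bounds = {(c.start, c.end) for c in self.constituents}
        return [e for e in self.entities if (e.start, e.end) not in bounds]


# Toy sentence, invented for illustration only.
sentence = Sentence(
    tokens=["Compound", "X", "inhibits", "enzyme", "Y"],
    constituents=[
        Span(0, 5, "S"),
        Span(0, 2, "NP"),
        Span(2, 5, "VP"),
        Span(3, 5, "NP"),
    ],
    entities=[
        Span(0, 2, "substance"),  # hypothetical entity label
        Span(3, 5, "enzyme"),     # hypothetical entity label
    ],
    arguments=[
        (2, "ARG0", Span(0, 2, "NP")),
        (2, "ARG1", Span(3, 5, "NP")),
    ],
)

# Every entity in the toy sentence coincides with a constituent.
aligned = sentence.unaligned_entities() == []

# An entity covering only "Y" has no matching constituent and is flagged.
misaligned = Sentence(
    tokens=sentence.tokens,
    constituents=sentence.constituents,
    entities=[Span(4, 5, "enzyme")],
)
flagged = misaligned.unaligned_entities()
```

Keeping all layers as offsets into one token list is a design choice made here for simplicity: it lets a consistency check such as unaligned_entities compare layers directly, which is one plausible reading of what integrating entity annotation with syntactic structure requires.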

Similar resources

A semantic-based workflow for biomedical literature annotation

Computational annotation of textual information has taken on an important role in knowledge extraction from the biomedical literature, since most of the relevant information from scientific findings is still maintained in text format. In this endeavour, annotation tools can assist in the identification of biomedical concepts and their relationships, providing faster reading and curation process...

Annotation of anaphoric relations in biomedical full-text articles using a domain-relevant scheme

Biomedical literature has been the focus of relevant information extraction projects; however, for this domain there is no corpus of full scientific articles annotated with anaphoric links for training and evaluating anaphora resolution systems, which are an important part of information extraction efforts. We have created a corpus of biomedical articles that are annotated with anaphoric links...

Open Ontology Forge: A Tool for Ontology Creation and Text Annotation Applied to the Biomedical Domain

In this paper, we introduce Open Ontology Forge (OOF), a software tool for ontology creation, terminology annotation, and coreference annotation by experts, applied to the biomedical domain. Encoding experts' knowledge of this domain in a consistent and machine-understandable way is important in order to make the knowledge publicly available and improve the quality of information extraction...

An Evaluation of Annotation Tools for Biomedical Texts

Biomedical texts are a rich information source that cannot be ignored. There are several text annotation tools that may be used to extract useful information from these texts. However, the multi-domain character of these texts and the diversity of ontologies available in this area demand a careful analysis before choosing an annotation tool. This work presents an evaluation of the exist...

A Curation Pipeline and Web-Services for PDF Documents

The continuous growth of the biomedical literature and the need to efficiently find and extract information from its content led to the development of various text mining tools. More recently, these tools started being integrated in user-friendly applications facilitating their use by expert database curators. However, these tools were mainly designed to extract information from text based docu...

Publication date: 2004